Human DNA topoisomerase I poisoning causes R loop–mediated genome instability attenuated by transcription factor IIS

DNA topoisomerase I can contribute to cancer genome instability. During catalytic activity, topoisomerase I forms a transient intermediate, the topoisomerase I–DNA cleavage complex (Top1cc), to allow strand rotation and duplex relaxation, which can lead to elevated levels of DNA-RNA hybrids and micronuclei. To understand the underlying mechanisms, we have integrated genomic data of Top1cc-triggered hybrids and DNA double-strand breaks (DSBs) shortly after Top1cc induction, revealing that Top1ccs increase hybrid levels through different mechanisms. DSBs occur at highly transcribed genes in early replicating initiation zones and overlap with hybrids downstream of accumulated RNA polymerase II (RNAPII) at gene 5′-ends. A transcription factor IIS mutant impairing transcription elongation further increased RNAPII accumulation, likely due to backtracking. Moreover, Top1ccs can trigger micronuclei when occurring during late G1 or early/mid S, but not during late S. As micronuclei and transcription-replication conflicts are attenuated by transcription factor IIS, our results support a role of RNAPII arrest in Top1cc-induced transcription-replication conflicts leading to DSBs and micronuclei.

Figure S2. Representative genomic loci of hybrid categories. A. Integrative Genomics Viewer (IGV) screenshots of DRIP-seq (NT, 5' CPT and 10' CPT), GRO-seq, RNAPII ChIP-seq (combined tracks for NT in green and for 10' CPT in blue), Top1 ChIP-seq and Top1cc-seq levels at representative genomic loci for each R-loop kinetic category, as reported in Figure 1D. The DRIP peak row indicates the peak region. B. Metaplots of DRIP-seq and Top1cc-seq normalized levels at Stable and Transient R-loop GAIN regions of genes. "Center" represents the R-loop peak center, in a window of -/+ 10 kbp. Line colors as in legend. C. Hybrid detection with the R-ChIP method at four R-loop-positive loci and a negative one. Bars represent at least 3 biological replicates of untreated cells and are normalized on HeLa cells stably overexpressing WT RNaseH1 protein (black bar). As expected, data show a specific R-loop recovery only in cells overexpressing D210N mutant RNaseH1. Color code as in the legend.

Figure S3. RNAPII accumulation induced by short exposures of HEK293 cells to camptothecin. A. Integrative Genomics Viewer (IGV) screenshots of DRIP-seq (NT, 5' CPT and 10' CPT), RNAPII ChIP-seq (combined tracks for NT in green and for 10' CPT in blue), and Top1cc-seq levels at representative genomic loci for stable and transient R-loop GAIN divided in promoter-associated (up to -10 kbp from promoter and 5'UTR), gene body-associated (located in exon and intron) and terminator-associated (located in 3'UTR and downstream 10 kbp) regions of genes, as reported in Figure 3A. The DRIP peak row indicates the peak region. B. Possible mechanism of RNAPII accumulation and hybrid formation caused by Top1cc. C. Total protein staining as loading control for TFIISm expression induced by doxycycline (+). D-E. Bar plot of RNAPII and TFIIS levels at transcribed (MYC) or not transcribed (α-sat and DACH2) loci determined by ChIP (antibodies sc-47701 and ab185947, respectively). HEK293 cells were treated with CPT for 10 minutes. TFIISm was induced for 48 hours with doxycycline. DNA enrichment over input was quantified by real-time PCR. Data are from at least three biological replicates. Statistical analysis was performed using multiple unpaired t-tests. Asterisks show p-values as follows: *, <0.05; **, <0.01; ***, <0.001; and ****, <0.0001. F. R-loop levels determined by DRIP-qPCR at indicated loci in not induced and induced TFIISm HEK293 cells treated with RNaseH1. DRIP enrichment over input was normalized on pFC53 plasmid spike-in (RF amplicon). At least three biological replicates are reported. Each bar represents mean values ± SEM and statistical tests were performed with one-tailed ratio paired t-test. Asterisks show p-values as follows: *, <0.05; **, <0.01; ***, <0.001; and ****, <0.0001.
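The asterisk convention used throughout these legends can be written as a small lookup. The following is only an illustrative sketch: the function name `significance_stars` and the "ns" label for non-significant values are assumptions, not taken from the legends.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to the asterisk notation used in the figure legends."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # Thresholds are strict (<) and checked from most to least stringent.
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for threshold, stars in thresholds:
        if p_value < threshold:
            return stars
    # "ns" is an assumed label; the legends do not name the non-significant case.
    return "ns"
```

For example, a p-value of 0.005 falls below 0.01 but not below 0.001, so it is reported with two asterisks.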

Figure S4. DNA cleavage sites induced by short exposures of HCT116 cells to camptothecin. A. By neutral Comet assay, DSBs are detected in HCT116 cells after 10 minutes of 10 μM CPT and increase at 20 minutes. At least 230 cells have been observed. Three biological replicates are reported. Statistical significance was calculated considering the distributions of treated samples vs untreated samples with the two-tailed Mann-Whitney test. p values are: *, <0.05; **, <0.01; ***, <0.001; and ****, <0.0001. Representative cell images are shown on the right. B. Alkaline Comet assay performed in HCT116 cells as in (A). At least 250 cells have been observed. Single-strand breaks are already at maximum levels after 10 minutes of treatment and can derive from Top1ccs. Three biological replicates are reported. Statistical significance as in (A). C. Workflow of analyses of END-Seq read sequences. After quality control, reads were aligned on human and mouse reference genomes and filtered for MAPQ. Aligned reads were then split by first-in-pair orientation (forward reads, mapping on the + strand, are reported in pink, while reverse reads, mapping on the - strand, are reported in violet). Peak calling was performed separately for forward and reverse reads. Red and blue rectangles represent peaks for forward and reverse reads. After peak calling, forward and reverse peaks resembling the whole END-Seq signal were identified; then, start coordinates of reverse peaks (in blue) and end coordinates of the closest forward peaks (in red) were used to recreate the whole END-Seq peaks for each replicate (in green). After the creation of the universe of peaks, reads covering peaks were calculated and normalized. Then, a differential expression analysis was performed between treated and control conditions. D. Volcano plots of END-Seq peaks in 10' CPT vs NT (left) and 20' CPT vs NT (right) increased (red) or decreased (blue) in CPT-treated cells. Each dot represents an END-Seq peak. Colors as in legend. Vertical dotted lines indicate thresholds of log2(fold-change) > 1 and log2(fold-change) < -1 (x-axis). The horizontal dotted line indicates a threshold of p-value < 0.05, indicated as -log10(p-value) (y-axis). E. Venn diagram reporting numbers of DSBs after 10' CPT or 20' CPT treatment. Persistent DSB clusters are those in common between the two time points. F. Metaplots and heatmaps of END-Seq normalized signal of pair-oriented reads (+ strand, blue and green; - strand, light-blue and orange) at significantly higher DSB clusters after the two studied times: 10' CPT vs NT (left); 20' CPT vs NT (right). "c" indicates the center of DSB clusters. G. The genomic locus of a representative persistent DSB cluster reporting normalized read signals for 10' CPT (light blue), 20' CPT (blue) and NT (orange) samples on both plus (+) and minus (-) strands. H. Integrative Genomics Viewer (IGV) screenshots of END-seq (NT, 10' CPT and 20' CPT), GRO-seq, RNAPII ChIP-seq (combined tracks for NT in green and for 10' CPT in blue), Top1cc-seq, different histone marks ChIP-seq and DNase-seq levels at representative genomic loci for each DSB cluster kinetic category, as reported in Figure 4A, D and E. The END peak row indicates the DSB cluster region.
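The peak-reconstruction step of the END-Seq workflow in panel C (start of a reverse peak joined to the end of the closest forward peak) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Peak` type, the `pair_peaks` function, the gap-based definition of "closest", and the `max_gap` cutoff of 1000 bp are all assumptions introduced here.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int


def gap(a: Peak, b: Peak) -> int:
    """Distance in bp between two intervals on one chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def pair_peaks(reverse: List[Peak], forward: List[Peak], max_gap: int = 1000) -> List[Peak]:
    """Join each reverse-strand peak with its closest forward-strand peak.

    The merged peak runs from the reverse peak start to the forward peak end.
    max_gap is a hypothetical cutoff; the legend does not state one.
    """
    merged = []
    for rev in reverse:
        candidates = [
            fwd for fwd in forward
            if fwd.chrom == rev.chrom
            and fwd.end > rev.start          # merged interval must be non-empty
            and gap(rev, fwd) <= max_gap
        ]
        if not candidates:
            continue  # reverse peak without a nearby forward partner is dropped
        closest = min(candidates, key=lambda fwd: gap(rev, fwd))
        merged.append(Peak(rev.chrom, rev.start, closest.end))
    return merged
```

With a reverse peak at chr1:100-200 and forward peaks at chr1:210-320 and chr1:5000-5100, only the first forward peak is within the cutoff, so the reconstructed peak spans chr1:100-320.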
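The thresholds drawn in the volcano plots (panel D) and the definition of persistent DSB clusters (panel E) amount to a simple classification followed by a set intersection. The sketch below assumes that both thresholds must be met for a peak to be called increased or decreased; the function names and the peak identifiers in the usage note are hypothetical.

```python
import math
from typing import Dict, Set, Tuple


def classify_peak(log2_fc: float, p_value: float,
                  fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Classify one END-Seq peak using the volcano-plot thresholds."""
    if p_value < p_cutoff and log2_fc > fc_cutoff:
        return "increased"
    if p_value < p_cutoff and log2_fc < -fc_cutoff:
        return "decreased"
    return "unchanged"


def volcano_y(p_value: float) -> float:
    """y-axis coordinate of the volcano plot: -log10(p-value)."""
    return -math.log10(p_value)


def increased_peaks(results: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Return identifiers of peaks classified as increased.

    results maps a peak identifier to (log2 fold-change, p-value).
    """
    return {peak for peak, (fc, p) in results.items()
            if classify_peak(fc, p) == "increased"}


def persistent_peaks(results_10: Dict[str, Tuple[float, float]],
                     results_20: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Peaks increased at both time points (the overlap of the Venn diagram)."""
    return increased_peaks(results_10) & increased_peaks(results_20)
```

As a usage example with made-up values, `persistent_peaks({"p1": (2.0, 0.001), "p2": (1.5, 0.01)}, {"p1": (3.0, 0.0001), "p2": (0.4, 0.3)})` returns `{"p1"}`, because only `p1` passes both thresholds at both time points.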

Figure S5. High transcription levels distinguish stable GAIN hybrids associated with DNA cleavage from those without DNA cleavage. A. The genomic locus of a representative persistent single-ended DSB reporting normalized read signals for 10' CPT (light blue), 20' CPT (blue) and NT (orange) samples on both plus (+) and minus (-) strands. B. Metaplots and heatmaps of END-

Figure S7. Hybrids are enriched at class 1 TAD boundaries. A. Metaplot of non-treated (NT), 5' CPT and 60' CPT DRIP-Seq signal at TAD boundary classes. "RH" indicates samples treated with RNaseH1. Colors as in legend. "Bound." indicates the TAD boundary in a window of +/- 0.1 Mb. DRIP-seq levels are reported as mean of normalized read density (n.r.d.). B. Metaplot of non-treated (NT), 5' CPT and 60' CPT DRIP-Seq signal normalized levels at class 1 TAD boundaries in a window of +/- 50 kb. Colors as in legend. "Bound." indicates TAD Start Site and TAD End Site. Class 1 boundaries were divided into four k-means clusters on the basis of DRIP-seq levels. This categorization highlights that most class 1 boundaries (clusters 3 and 4) have very low levels of DRIP-seq signal, while clusters 1 and 2, which are fewer, have high levels of DRIP-seq signal that is enriched asymmetrically between the two genomic regions upstream and downstream of the TAD. C. Metaplots of DRIP-Seq, RNAPII and END-Seq at gene regions associated with Stable Gain hybrids with Persistent DSBs at Class 1 TAD and Early IZ (left) and at gene regions associated with Stable Gain hybrids without Persistent DSBs at Class 1 TAD and Early IZ (right). Colors as in legend. TSS and TES indicate Transcription Start Site and Transcription End Site of genes. D. Representative images of RNAPII:PCNA proximity ligation assay (PLA) with relative negative control samples (Ctrl RNAPII/PCNA). E. PLA of Proliferating Cell Nuclear Antigen (PCNA, Ab PC11, sc-53407) with RNAPII (Ab H-224, sc-9001) in cells expressing (+ doxy) or not (- doxy) TFIISm. After TFIISm over-expression, cells were treated with 1 µM triptolide (TRP), with 12.5 µg/mL cordycepin (COR) or with 1 µM aphidicolin (APH) for 2 hours prior to 10 minutes of CPT treatment. Fold change of CPT-treated versus untreated samples and doxycycline-induced versus not doxycycline-induced samples is reported. Each bar represents the mean value ± SEM of three independent experiments. Average number of cells analyzed is 400. Statistical significance was calculated with one-tailed paired t-test. p values are: *, <0.05; **, <0.01; ***, <0.001; and ****, <0.0001.
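The bar values in panel E (fold change reported as mean ± SEM of three independent experiments) can be illustrated with a short calculation. This sketch assumes that the fold change is computed within each experiment as treated over untreated signal and then averaged; the legend does not specify this, and the numbers in the usage note are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def fold_changes(treated: Sequence[float], control: Sequence[float]) -> List[float]:
    """Per-experiment fold change of treated over control (paired by index)."""
    if len(treated) != len(control):
        raise ValueError("treated and control must have the same length")
    return [t / c for t, c in zip(treated, control)]


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    return mean(values), stdev(values) / sqrt(len(values))
```

For hypothetical PLA signals of 4.0, 6.0 and 8.0 in treated cells against 2.0 in each matched control, the fold changes are 2.0, 3.0 and 4.0, giving a mean of 3.0 and an SEM of about 0.58.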

Figure S8. Top1cc induction during non-S phase can trigger mitotic errors and micronuclei mediated by DNA-RNA hybrids. A. Representative images of both EdU+/- cells and EdU+/-

Figure S9. Elongation factor TFIISm can increase micronuclei when Top1cc induction occurs in non-S phase cells. A. Representative images of dual-labelled TFIISm-expressing HEK293 cells in different cell cycle phases. B-C. Micronuclei quantitation in HEK293 cells, overexpressing or not TFIISm, that were in late S-phase (B) or non-S phase (C) during 1 h CPT administration. For each bar plot the number of micronuclei is reported as micronuclei/100 cells. Average number of cells is 120 (B) and 700 (C). Each bar represents the mean value ± SEM. Statistical significance was calculated comparing micronuclei distribution of treated over non-